MEASURING VISUOSPATIAL WORKING MEMORY
USING PATH VISUALIZATION


INTRODUCTION

The  term  visuospatial  working  memory  (VSWM)  refers  to  a  set  of  cognitive 
processes  that  people  use  to  visualize  spatial  configurations.  VSWM  is  not  typically 
viewed  as  permanent  visual  memory,  but  rather  as  a  temporary  workspace  for 
visuospatial  computations.  VSWM  processes  can  be  experimentally  distinguished  from 
the  processes  that  support  working  memory  for  verbal  materials  (e.g.,  Logie,  1994;  Smith 
& Jonides, 1997). VSWM is said to be involved in virtually all spatial problem solving,
from designing a product to visualizing a route to the airport. It may be
particularly  crucial  for  operators  of  Uninhabited  Aerial  Vehicles  (UAVs)  because  they 
control  the  aircraft  from  a  Ground  Control  Station,  so  they  must  visualize  their  flight  path 
plus  the  positions  of  many  objects  (terrain,  threats,  other  aircraft,  reconnaissance 
objectives,  etc.),  without  the  benefit  of  a  panoramic  view  from  the  cockpit. 

Much  remains  to  be  understood  about  VSWM.  We  need  to  know  how  much 
information  it  can  hold;  what  causes  loss  of  information  from  it;  how  information  in  it 
can  be  organized  (3D  egocentric,  2D  map,  3D  allocentric,  or  some  other);  and  what 
operations  (e.g.  rotation,  expansion,  scanning)  can  be  performed  on  this  information.  To 
shed  light  on  these  issues,  we  need  ways  to  study  VSWM  objectively. 

This  paper  describes  a  new  technique  called  Path  Visualization  (PV)  for  obtaining 
quantitative information about VSWM. The PV paradigm yields an accuracy- and
response-time-based quantification of the mental “space” in which human spatial
visualization  takes  place. 

THE  PATH  VISUALIZATION  TASK 

It  is  difficult  to  find  objective  measures  of  any  mental  operation.  The  nonverbal, 
ephemeral  qualities  of  spatial  visualizations  make  them  particularly  elusive.  Path 
Visualization  is  an  objective  method  that  seems  to  avoid  at  least  one  potential 
measurement  pitfall.  In  the  PV  paradigm,  observers  do  not  merely  reproduce  a  sequence 
of  visual  stimuli,  as  is  sometimes  the  case  for  tests  of  visual  short-term  memory.  Such 
tests  allow  the  possibility  that  stimuli  could  be  encoded  and  recalled  in  a  way  that  does 
not  require  an  explicitly  spatial  representation  (for  example,  verbal  rehearsal).  In  contrast, 
the  PV  task  forces  the  observer  to  perform  a  spatial  computation  on  the  stimulus 
sequence  if  he  or  she  is  to  respond  accurately.  This  computation  requires  a  spatial 
representation  of  multiple  locations  in  a  complex  path. 

In  the  PV  task,  people  try  to  visualize  paths  that  are  described  piece-by-piece 
within  an  imaginary  space  (Figure  1).  The  space  is  typically  a  5  x  5  x  5  three-dimensional 
cube-shaped  grid.  Paths  start  at  the  center  of  this  grid.  Each  path  consists  of  a  series  of 
segments.  Each  segment  consists  of  a  direction  and  a  distance,  where  distances  are  given 
in units on the imaginary grid. Segment descriptions can be given using synthetic speech,
as text on a monitor, as arrows or lines in a diagram, or as a visual depiction of virtual
self-motion.  In  the  standard  PV  task,  text  or  speech  descriptions  are  used,  and  distances 
are  always  one  unit.  Directions  can  be  fixed  with  respect  to  the  axes  of  the  grid,  or  they 
can  be  described  with  respect  to  an  observer  moving  along  the  path.  For  example,  the 
fixed (absolute) segment descriptions for a short square path that returns to its origin at
the center would be: Forward 1 unit, Left 1 unit, Back 1 unit, Right 1 unit. The relative
(ego-referenced) segment descriptions for this same path would be: Forward 1, Left 1,
Left 1, Left 1.
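The correspondence between the two descriptor types can be illustrated with a minimal sketch. It is restricted to the horizontal plane and assumes that the imaginary observer starts out facing Forward and, after each segment, faces the direction just traveled. The coordinate convention and the names FIXED, turn, and ego_to_fixed are illustrative assumptions, not part of the PV software.

```python
# Fixed-coordinate unit steps in the horizontal plane: (x = right, y = forward).
FIXED = {"Forward": (0, 1), "Left": (-1, 0), "Back": (0, -1), "Right": (1, 0)}
NAMES = {vector: name for name, vector in FIXED.items()}


def turn(heading, relative):
    """Return the new heading after applying an ego-referenced direction."""
    x, y = heading
    if relative == "Forward":
        return (x, y)
    if relative == "Left":      # rotate 90 degrees counterclockwise
        return (-y, x)
    if relative == "Right":     # rotate 90 degrees clockwise
        return (y, -x)
    if relative == "Back":      # reverse direction
        return (-x, -y)
    raise ValueError(f"unknown direction: {relative}")


def ego_to_fixed(relative_moves, start_heading=(0, 1)):
    """Translate ego-referenced descriptors into fixed-coordinate descriptors."""
    heading = start_heading
    fixed_moves = []
    for move in relative_moves:
        heading = turn(heading, move)   # the observer now faces the travel direction
        fixed_moves.append(NAMES[heading])
    return fixed_moves


square = ego_to_fixed(["Forward", "Left", "Left", "Left"])
print(square)
```

For the square path described above, the ego-referenced sequence Forward, Left, Left, Left maps onto the fixed sequence Forward, Left, Back, Right. A full 3D version would also need a convention for how Up and Down segments change the observer's orientation, which is not specified here.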


Figure 1. General description of the Path Visualization task. The participant views or
hears a description of a path segment, mentally adds the segment to a path in an imaginary
5 x 5 x 5 space, decides if the new segment intersects with the existing path, and indicates
the decision with a keypress. Direction labels shown in the figure: Up, Down, Left, Right,
Forward, Back.

On  each  trial,  a  single  15-segment  path  is  presented.  Before  initiating  a  trial,  the 
study participant must have his/her left index finger on the Left-Arrow key (not the
numeric keypad) and his/her right index finger on the Right-Arrow key. A trial is
initiated  by  pressing  the  keypad  Enter  key  with  the  little  finger  of  the  right  hand.  Then  a 
sequence  of  15  segment  descriptions  is  presented.  Each  segment  description  is  displayed 
for  2  s.  The  participant’s  task  is  to  mentally  construct  a  path  using  these  segment 
descriptions,  adding  each  new  segment  to  the  path  as  it  is  presented. 

To  verify  the  accuracy  of  the  participant’s  visualized  path,  after  each  segment  is 
presented,  he/she  must  press  a  key  as  quickly  as  possible  indicating  whether  or  not  the 
new  segment  intersected  with  any  previously  presented  part  of  the  path.  The  participant 
must press the Left-Arrow key if he/she believes that the endpoint of the segment just
presented did not revisit (intersect) any location from any of the previously presented
segments, including the central starting location. Pressing the Right-Arrow key indicates
the belief that the endpoint of the most recent segment did revisit one of the locations that
are part of the path presented so far. If either key is pressed during the display of the
segment, the reaction time and accuracy of the keypress are recorded and the segment
display continues until the 2-s display time has elapsed. If no key is pressed, the response
is scored as a timeout. Whether or not a key is pressed, the next segment of the path is
presented after the prior display has been shown for 2 s and a 133-ms blank screen has
been presented. After all 15 segments have been presented, a feedback screen appears,
giving information about reaction time and accuracy for the trial. This information
includes mean reaction times for correctly identified intersections, correctly identified
non-intersections, intersections missed, and non-intersections incorrectly identified as
intersections (false alarms). Pressing the Enter key initiates the next trial.
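The scoring rule just described can be sketched as follows, assuming unit-length segments with fixed-coordinate descriptors. The sketch marks a segment as an intersection when its endpoint revisits any earlier location (including the start) and sorts each keypress into the four feedback categories plus timeouts. The names STEP, intersection_flags, and score_trial, and the response format, are hypothetical; this is not the author's software.

```python
from collections import defaultdict
from statistics import mean

# Unit steps for fixed-coordinate descriptors: (x = right, y = forward, z = up).
STEP = {
    "Up": (0, 0, 1), "Down": (0, 0, -1),
    "Left": (-1, 0, 0), "Right": (1, 0, 0),
    "Forward": (0, 1, 0), "Back": (0, -1, 0),
}


def intersection_flags(directions, start=(0, 0, 0)):
    """For each segment, True if its endpoint revisits an earlier path location."""
    visited = {start}           # the central starting location counts as visited
    position = start
    flags = []
    for direction in directions:
        dx, dy, dz = STEP[direction]
        position = (position[0] + dx, position[1] + dy, position[2] + dz)
        flags.append(position in visited)
        visited.add(position)
    return flags


def score_trial(directions, responses):
    """Classify responses; each is ("left" | "right", rt_seconds) or None (timeout).

    "left" means "no intersection" and "right" means "intersection".
    Returns (counts per category, mean reaction time per category).
    """
    counts = defaultdict(int)
    reaction_times = defaultdict(list)
    for is_intersection, response in zip(intersection_flags(directions), responses):
        if response is None:
            counts["timeout"] += 1
            continue
        key, rt = response
        said_intersection = key == "right"
        if is_intersection and said_intersection:
            label = "hit"
        elif is_intersection:
            label = "miss"
        elif said_intersection:
            label = "false_alarm"
        else:
            label = "correct_rejection"
        counts[label] += 1
        reaction_times[label].append(rt)
    mean_rt = {label: mean(values) for label, values in reaction_times.items()}
    return dict(counts), mean_rt


counts, mean_rt = score_trial(
    ["Forward", "Left", "Back", "Right"],
    [("left", 0.8), ("right", 1.0), None, ("right", 1.2)],
)
print(counts, mean_rt)
```

In this short example only the fourth segment returns to the starting location, so the fourth response is the only possible hit. The sketch does not check that the path stays inside the 5 x 5 x 5 grid.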

The PV task is similar to some existing methods (e.g., Attneave & Curlee, 1983;
Barshi & Healy, 2002; Brooks, 1968; Carlson & Sohn, 2000; Diwadkar, Carpenter &
Just, 2000; Kerr, 1987, 1993; Vecchi & Girelli, 1998) in that it requires participants to
keep track of changes in the position of a point in a two-dimensional or three-dimensional
array of locations. However, success at Path Visualization requires more than tracking a
single point, which could possibly be accomplished using some mathematical recoding
rather than visualization. In PV, both the current end position and the prior path must be
held in memory in order to determine whether an intersection has occurred.

Many variations of the PV task are possible. Comparing performance on different
variations can shed light on various issues in the modeling of human spatial visualization.
Here are four examples of issues currently being addressed using the PV task:

Are  2D  and  3D  Representations  Handled  the  Same  Way  in  Visuospatial  Working 
Memory? 

A controversial issue in the study of spatial visualization is the dimensionality of
the mental representation. At one extreme is the notion that somewhere in the brain there
exists a three-dimensional analog projection area for representing 3D space. The other
extreme is the idea that spatial information is represented propositionally, in much the
same way as nonspatial information (e.g., Hinton, 1979). A popular middle ground, at
least for visuospatial imagery, has been various kinds of array-like theories, cf. Kosslyn
(1980, 1994), in which an analog 2D projection area suffices for both 2D and 3D
information. Additional possibilities (including a Marr-like 2½-D sketch) have also been
proposed (Pinker, 1988). It is also possible that representations of 2D and 3D space use
different mechanisms, and therefore might be distinct abilities. Indeed, Cornoldi, Cortesi,
and Preti (1991), using the Kerr (1987) location tracking task, showed that congenitally
blind participants are as good as sighted participants in tracking 2D location, but the blind
have particular difficulty tracking locations in three dimensions.

Evidence potentially relevant to this issue can be obtained by comparing
performance on different kinds of paths in the PV task. Paths can be restricted to lie on
horizontal, coronal, or sagittal planes so that working memory for 2D versus 3D structures
can be assessed. Planar paths can be embedded among 3D paths so that participants will
not know before the end of the path whether it is 2D or 3D. Therefore, any differences in
performance between 2D and 3D paths are unlikely to be due to the selection of different
strategies. Of course, differences in performance for 2D and 3D paths must be
interpreted in the light of differences in connectivity (a maximum of 4 for 2D and 6 for 3D,
assuming orthogonal directions) and other possible differences in the characteristics of
random paths generated in 2D vs. 3D.
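The connectivity difference can be made concrete with a sketch of a random path generator. It assumes that paths are confined to the 5 x 5 x 5 grid (coordinates from -2 to 2 around the central start) and that each segment is drawn uniformly from the in-bounds orthogonal moves; neither assumption is stated in the task description. The names MOVES_3D, moves_for, and random_path are hypothetical.

```python
import random

# The six orthogonal unit moves: (x = right, y = forward, z = up).
MOVES_3D = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def moves_for(fixed_axis=None):
    """Moves available for a 3D path (None) or a planar path.

    fixed_axis is the coordinate held constant: 0 for a sagittal plane,
    1 for a coronal plane, 2 for a horizontal plane.
    """
    return [m for m in MOVES_3D if fixed_axis is None or m[fixed_axis] == 0]


def random_path(n_segments=15, half_width=2, fixed_axis=None, rng=None):
    """Return the list of visited locations, starting at the grid center."""
    rng = rng or random.Random()
    position = (0, 0, 0)
    path = [position]
    for _ in range(n_segments):
        candidates = []
        for move in moves_for(fixed_axis):
            nxt = tuple(p + d for p, d in zip(position, move))
            if all(abs(c) <= half_width for c in nxt):   # stay inside the grid
                candidates.append(nxt)
        position = rng.choice(candidates)
        path.append(position)
    return path


print(len(moves_for(fixed_axis=2)), len(moves_for()))   # 2D vs. 3D connectivity
print(random_path(fixed_axis=2, rng=random.Random(0)))  # a horizontal planar path
```

Because a planar path has only four moves available at each step, random 2D paths under these assumptions would be expected to revisit locations more often than random 3D paths of the same length, which is one reason raw 2D and 3D performance cannot be compared directly.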

Do  Ego-Referenced  and  Fixed-Coordinate  Descriptions  Use  the  Same  Spatial 
Memory  System? 

As  noted  earlier,  paths  in  the  PV  task  can  be  described  using  fixed-coordinate 
descriptors,  in  which  the  labels  "up,"  "down,"  "right,"  etc.,  always  refer  to  the  same 
directions  in  an  external  coordinate  system,  or  ego-referenced  descriptors,  in  which  the 
labels  are  relative  to  an  imaginary  observer  moving  along  the  path.  These  two  kinds  of 
descriptors  are  related  to  two  different  representations  of  space.  In  the  context  of  the  PV 
task,  an  egocentric  representation  means  that  the  participant's  mental  viewpoint  is  along 
the  path,  whereas  in  an  allocentric  representation,  the  path  is  viewed  from  somewhere 
else  (for  example,  looking  down  from  above).  There  is  no  logically  necessary  connection 
between  types  of  path  descriptors  and  types  of  spatial  representations;  one  can  perform 
mental  transformations  to  obtain  either  kind  of  representation  from  either  kind  of 
descriptor.  There  is,  however,  a  strong  natural  correspondence  between  ego-referenced 
descriptors  and  an  egocentric  representation,  and  between  fixed-coordinate  descriptors 
and  allocentric  representation. 

Both egocentric and allocentric representations are critical to the ability to navigate
in  the  world,  but  they  are  quite  different.  Egocentric  representation  arises  naturally  when 
moving  through  space;  allocentric  representation  is  natural  for  a  map-like  depiction. 

There  is  neurophysiological  evidence  that  the  brain  constructs  both  kinds  of 
representation,  each  in  a  different  area  (O'Keefe,  1992;  Stein,  1992).  This  functional  and 
physiological  differentiation  suggests  that  there  may  be  separate  systems  for  egocentric 
and  allocentric  spatial  computations,  and  thus,  perhaps,  separate  abilities. 

The  PV  task  can  be  given  with  either  fixed-coordinate  or  ego-referenced  path 
descriptors.  Since  all  other  aspects  of  the  task  are  the  same  with  either  descriptor  type, 
PV  provides  a  clean  comparison  of  performance  given  path  descriptors  from  different 
frames  of  reference. 

Are  Different  Parts  of  Visualized  Space  Equally  Well  Represented? 

Not all of visualized space may be equally well represented. There may be
differences in representation between near and far, or upper and lower parts of a
visualized 3D space. It is also possible that some individuals show evidence of
"spatial-image scotomas," areas of imagined space that may not be well represented, perhaps akin
to the visual neglect shown by patients with parietal damage. The homogeneous metric
character  of  the  PV  task  makes  it  possible  to  look  for  such  effects  in  a  normal  population. 
Given  a  sufficient  number  of  trials  per  participant,  mean  response  time  and  accuracy  can 
be  computed  for  different  regions  of  3D  mental  image  space. 
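A regional analysis of this kind might be sketched as follows. Each response is assumed to be recorded together with the grid location of the segment endpoint, and regions are defined here, purely for illustration, by the sign of the vertical coordinate. The record format and the names vertical_region and summarize_by_region are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def vertical_region(position):
    """Assign a grid location (x, y, z) to an upper, middle, or lower region."""
    z = position[2]
    if z > 0:
        return "upper"
    if z < 0:
        return "lower"
    return "middle"


def summarize_by_region(records, region_of=vertical_region):
    """Mean accuracy and response time per region.

    records: iterable of (endpoint_position, was_correct, rt_seconds).
    """
    grouped = defaultdict(list)
    for position, was_correct, rt in records:
        grouped[region_of(position)].append((was_correct, rt))
    summary = {}
    for region, items in grouped.items():
        summary[region] = {
            "n": len(items),
            "accuracy": mean(1.0 if correct else 0.0 for correct, _ in items),
            "mean_rt": mean(rt for _, rt in items),
        }
    return summary


records = [
    ((0, 0, 1), True, 1.0),
    ((0, 0, 2), False, 2.0),
    ((1, 0, -1), True, 0.5),
]
print(summarize_by_region(records))
```

Passing a different region_of function (for example, one that splits near from far along the forward axis) would give the corresponding near/far comparison without changing the rest of the analysis.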
How Does Linguistically Described Space Differ From Visually Experienced Space?

UAV pilots typically experience the space traversed by the aircraft via a video
feed. Flying using the video feed, when the camera is locked to the direction of flight, is
similar to flying a PC flight simulation game. The pilot is looking at a 30-degree
field-of-view scene displayed on a monitor. The pilot may also hear aspects of space described
in radio communications.

Experienced space (the video feed) might be processed in the brain by an episodic
memory system in which visual details and impressions of speed and distance are
preserved. Linguistically described space might be encoded differently. How might the
representations of these two sources of spatial information differ? Are capacity limits
different in the two representations? Do they use different rules of organization? Are the
patterns of likely spatial errors different? Do they reflect the same individual spatial
abilities?

To address these issues, one must have a way of presenting the same spatial
information, requiring the same response, in both linguistic and visual-motion forms, and
without confounding effects of specific knowledge or strategies. This can be
accomplished using the PV task because the same paths can be presented using either
linguistic descriptions (cf. Franklin & Tversky, 1992) or a visually rich virtual
fly-through.


CONCLUSION 

The Path Visualization task provides an objective method for studying temporary,
complex spatial constructions in visuospatial working memory. It provides both accuracy
and response-time data under different visualization conditions and for different regions
of 2D and 3D visualization space. Information about current research using the PV task
and software for presenting various PV conditions can be obtained by contacting the
author.